Identifying algorithmically generated domains

ABSTRACT

Examples relate to identifying algorithmically generated domains. In one example, a computing device may: receive a query domain name; provide the query domain name as input to a predictive model that has been trained to determine whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and receive, as output from the predictive model, data indicating whether the query domain name is algorithmically generated.

BACKGROUND

Computer networks and the devices that operate on them often experience problems for a variety of reasons, e.g., due to misconfiguration, software bugs, and malicious network and computing device attacks. Detecting and preventing the use and spreading of malicious software, for example, is often a priority for computer network administrators. Malicious software is increasingly designed to avoid detection using relatively sophisticated methods.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description references the drawings, wherein:

FIG. 1 is a block diagram of an example computing device for training a model to identify algorithmically generated domains.

FIG. 2 is an example data flow for identifying algorithmically generated domains.

FIG. 3 is a flowchart of an example method for identifying algorithmically generated domains.

DETAILED DESCRIPTION

Domain name generation algorithms (DGAs) may be implemented to produce random, or pseudo-random, domain names, e.g., often for temporary use by both legitimate and malicious entities. For example, a content delivery network may use algorithmically generated domains to provide various types of content, and malicious software (“malware”) may use algorithmically generated domains to avoid detection. Among other uses, the ability to identify when a domain name is algorithmically generated, and to identify which domain name generation algorithm is used, may facilitate the detection and prevention of malware infections.

Domain name system (DNS) queries are a type of network traffic generally produced by a computing device operating on a computer network; the DNS queries include data specifying a domain name and are addressed to a DNS server device for domain name resolution. The DNS server typically provides an IP address associated with the query domain name in response to the DNS query, e.g., a computing device that issues a DNS query for “www.example.com” may be provided with a response, from a DNS server, indicating the IP address associated with “www.example.com,” e.g., “123.456.789.012.” While DNS queries may be produced by computing devices for many non-malicious purposes, some malware may use DNS queries for malicious purposes.

As noted above, malware may make use of a DGA to periodically generate domain names to avoid detection, e.g., randomly generated domains can be used by malware command and control servers to provide infected computing devices with updates and/or commands. Malware makes use of DGAs, as opposed to static domains, to prevent the malware command and control servers from being blacklisted. An infected computing device may periodically attempt to reach out to a large number of randomly generated domain names, only a portion of which are registered to malware command and control servers. A network administrator's ability to detect a computing device that is using a DGA to generate a large number of randomly generated domain names may facilitate the identification of infected computing devices on the administrator's network. In addition, detecting the use of a DGA while network packets are in transit may allow administrators to block malicious, or potentially malicious, traffic.

In particular, a domain name analysis device may inspect DNS query packets sent from a client computing device. Each query domain name may be provided to a predictive model that has been trained to determine whether a domain name is algorithmically generated and, in some implementations, to identify the DGA used to generate the domain name. The predictive model may be trained to identify DGA use based on a variety of domain name features, such as a count of particular bigrams, a binary feature set indicating character positioning, and the number and type of syntax violations, to name a few. When the use of a DGA is detected, the domain name analysis device may take action designed to ensure that appropriate remedial measures are taken with respect to the client computing device that produced the algorithmically generated domain, e.g., by notifying a network administrator, an entity that manages the client computing device, or another party or device responsible for handling potential security threats. Further details regarding the training and use of the predictive model for identifying algorithmically generated domains are described in the paragraphs that follow.

Referring now to the drawings, FIG. 1 is a block diagram of an example computing device 100 for training a model to identify algorithmically generated domains. Computing device 100 may be, for example, a server computer, a personal computer, a mobile computing device, or any other electronic device suitable for processing data. In the embodiment of FIG. 1, computing device 100 includes hardware processor 110 and machine-readable storage medium 120.

Hardware processor 110 may be one or more central processing units (CPUs), semiconductor-based microprocessors, and/or other hardware devices suitable for retrieval and execution of instructions stored in machine-readable storage medium 120. Hardware processor 110 may fetch, decode, and execute instructions, such as 122-126, to control the process for training a model to identify algorithmically generated domains. As an alternative or in addition to retrieving and executing instructions, hardware processor 110 may include one or more electronic circuits that include electronic components for performing the functionality of one or more of the instructions.

A machine-readable storage medium, such as 120, may be any electronic, magnetic, optical, or other physical storage device that contains or stores executable instructions. Thus, machine-readable storage medium 120 may be, for example, Random Access Memory (RAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a storage device, an optical disc, and the like. In some implementations, storage medium 120 may be a non-transitory storage medium, where the term “non-transitory” does not encompass transitory propagating signals. As described in detail below, machine-readable storage medium 120 may be encoded with a series of executable instructions, 122-126, for training a model to identify algorithmically generated domains.

A domain name storage device 130 is in communication with the computing device 100 to provide the computing device 100 with domain names, e.g., domain names 132 and 134. The domain name storage device 130 may be any computing device, such as one similar to the computing device 100, and may include or have access to any number of storage mediums, similar to machine-readable storage medium 120. While the implementation depicted in FIG. 1 shows the domain name storage device 130 as the only source of domain names, the computing device 100 may receive domain names from a variety of sources.

As shown in FIG. 1, the computing device 100 executes instructions (122) to receive a first set of domain names 132, and each domain name in the first set is a domain name that was previously identified as valid. For example, a list of the most popular and/or most visited websites, and/or websites included on a whitelist of websites known to be non-malicious, may be used to identify valid domains included in the first set of domain names 132.

The computing device 100 executes instructions (124) to receive a second set of domain names 134, and each domain name in the second set is a domain name that was previously identified as algorithmically generated. For example, a list of algorithmically generated domains known to be used by various types of malware may be obtained, e.g., from various anti-malware organizations. In some implementations, known DGAs may be used to generate domain names to be included in the second set of domain names 134.

The computing device 100 executes instructions (126) to train, using the first and second sets, a predictive model to identify a given domain name as a valid domain name or an algorithmically generated domain name based on syntactic features. The syntactic features include a count of particular n-grams included in at least a portion of the given domain name, where n is a positive integer greater than one. For example, the number of non-English bigrams, e.g., occurrences of character pairs that do not exist in English words, in a second level domain may be a syntactic feature used to train the predictive model. While, in some implementations, syntactic features of whole domains, e.g., “www.example.com,” may be used to train the predictive model, in other implementations portions of a domain may be used, such as the prefix, e.g., “www,” the second level domain, e.g., “example,” and/or the top level domain, e.g., “com.”
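By way of illustration only, the division of a domain name into a prefix, a second level domain, and a top level domain, and the counting of character n-grams within one of those portions, may be sketched as follows. The function names `split_domain` and `ngram_counts` are hypothetical and do not appear in the disclosure, and the sketch assumes that the last label is the top level domain, which does not hold for multi-label suffixes such as "co.uk".

```python
from collections import Counter


def split_domain(domain):
    """Split a domain into (prefix, second level domain, top level domain).

    Simplifying assumption: the last label is the top level domain and the
    label before it is the second level domain.
    """
    labels = domain.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return "", labels[0], ""
    return ".".join(labels[:-2]), labels[-2], labels[-1]


def ngram_counts(text, n=2):
    """Count every overlapping character n-gram in text, for n > 1."""
    if n < 2:
        raise ValueError("n must be greater than one")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


prefix, sld, tld = split_domain("www.example.com")
bigrams = ngram_counts(sld, n=2)    # bigram counts for "example"
trigrams = ngram_counts(sld, n=3)   # trigram counts for "example"
```

A count of particular n-grams, such as non-English bigrams, may then be obtained by summing the counts of only those n-grams that belong to the particular set of interest.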

A variety of syntactic features may be used to train the predictive model. For example, syntactic features of a domain name used to train the predictive model may include one or more of the following: the top level domain, a binary vector of languages associated with the top level domain, a binary vector indicating the length of the second level domain, a scalar length of the second level domain, a binary vector of the length of the prefix, a scalar length of the prefix, a vector indicating a count of each of the “A-F” characters in the second level domain, a vector indicating a count of each of the “G-Z” characters in the second level domain, a vector indicating a count of each of the “a-f” characters in the second level domain, a vector indicating a count of each of the “g-z” characters in the second level domain, a count of English consonants in the second level domain, a count of English vowels in the second level domain, a count of “0-9” digit characters in the second level domain, a count of capital foreign letter characters in the second level domain, a count of lower case foreign letter characters in the second level domain, a count of dots, dashes, and underscores in the second level domain, a count of other printable ISO-8859-1 characters in the second level domain, a count of non-printable ISO-8859-1 characters in the second level domain, a vector indicating a count of each ISO-8859-1 character, a count of each valid character pair (2-gram) in the second level domain, a count of each valid character triple (3-gram) in the second level domain, a binary feature vector for the prefix and second level domain by position, where each binary value of the feature vector corresponds to a character and character position, a Boolean feature indicating instances of Punycode encoding, a Boolean vector of RFC 1034 syntax violations, and/or a count of English words in a domain name. Other syntactic features, including combinations of syntactic features and variations of the foregoing features, may also be used to train the predictive model.
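As a non-limiting sketch, a few of the scalar features listed above (the length of the second level domain and its counts of English vowels, English consonants, “0-9” digits, and dots, dashes, and underscores) might be computed as shown below. The function name `basic_features` and the dictionary keys are hypothetical, and treating “y” as a consonant is an assumption made only for this example.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")  # assumption: "y" counted as a consonant
DIGITS = set("0123456789")
SEPARATORS = set(".-_")


def basic_features(second_level_domain):
    """Return a small dictionary of scalar syntactic features."""
    s = second_level_domain.lower()
    return {
        "length": len(s),
        "vowels": sum(c in VOWELS for c in s),
        "consonants": sum(c in CONSONANTS for c in s),
        "digits": sum(c in DIGITS for c in s),
        "separators": sum(c in SEPARATORS for c in s),
    }


example_features = basic_features("example")
```

In practice, such scalar values may be concatenated with n-gram counts and the other listed features to form the input provided to the predictive model.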

The type of predictive model trained may vary. Example predictive models may include linear classifiers, decision trees, support vector machines, and nearest neighbor classifiers, to name a few. The predictive model may be trained to accept all or a portion of a domain name as input and produce, as output, an indication of whether a domain name is algorithmically generated and, in some implementations, which DGA was used to generate the domain name.
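For illustration, a nearest neighbor classifier, one of the example model types named above, may be sketched in a few lines. The class name `NearestNeighbor`, the two-element feature vector, and the six training strings are all hypothetical and chosen only to make the sketch self-contained; they are not real training data, and a practical model would be trained on far larger sets and on richer features.

```python
import math

VOWELS = set("aeiou")
DIGITS = set("0123456789")


def features(name):
    """Map a name to (vowel fraction, digit fraction); a toy feature choice."""
    s = name.lower()
    n = max(len(s), 1)
    return (sum(c in VOWELS for c in s) / n, sum(c in DIGITS for c in s) / n)


class NearestNeighbor:
    """1-nearest-neighbor classifier over the toy feature vectors."""

    def fit(self, names, labels):
        self.points = [(features(n), lab) for n, lab in zip(names, labels)]
        return self

    def predict(self, name):
        query = features(name)
        # Return the label of the closest training point (Euclidean distance).
        return min(self.points, key=lambda p: math.dist(query, p[0]))[1]


# Hypothetical toy sets standing in for the first and second sets of domain names.
valid_names = ["example", "weather", "library"]
generated_names = ["xqzvkjw", "q7zx9vk2", "zxqjvbk"]

model = NearestNeighbor().fit(
    valid_names + generated_names,
    ["valid"] * len(valid_names) + ["generated"] * len(generated_names),
)
```

Calling `model.predict` on a second level domain then yields either "valid" or "generated", which corresponds to the two-way output described above.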

In some implementations, the computing device 100 receives a query domain name, provides the query domain name to the predictive model as input, and receives, as output from the predictive model, a prediction specifying that the query domain name is either i) a valid domain name, or ii) an algorithmically generated domain name. For example, the query domain name may be extracted from a DNS query packet received by the computing device 100. Data indicating whether the query domain name was determined by the predictive model to be algorithmically generated may, in some implementations, be provided to a third party computing device, e.g., an administrator computing device, a client computing device, and/or a network security pipeline for handling potential security threats.

In some implementations, the computing device 100 receives a third set of domain names, and each domain name in the third set is a domain name that was previously identified as being generated by a particular DGA. For example, the particular DGA may be used to generate the domain names of the third set, which may then be provided to the computing device 100. Using the third set, a second predictive model may be trained to determine whether a particular domain name was generated by the particular DGA, and the determination may again be based on at least one of the syntactic features. In this situation, the computing device 100 may be able to identify use of the particular DGA, in addition to being able to generally detect DGA use.

The predictive model or models may be trained to produce a variety of types of output. In some implementations, a single predictive model may be trained to produce, as output, one of a plurality of DGAs used to produce the domain name provided as input, e.g., without first making a separate determination that the domain name is algorithmically generated. In some implementations, separate predictive models may be trained for each DGA, and domain names received by the computing device 100 may be provided to each predictive model until a positive identification of particular DGA use is determined. In some implementations, one or more predictive models may be trained to provide, as output, a measure of confidence that the domain name provided as input is algorithmically generated and/or a measure of confidence that the domain name provided as input was generated by a particular DGA. For example, a predictive model may be trained to provide, as output, a list of DGAs and, for each DGA, a measure of likelihood that the domain name provided as input was generated by the DGA. Further examples and details regarding the identification of algorithmically generated domains are provided in the paragraphs that follow.

FIG. 2 is an example data flow 200 for identifying algorithmically generated domains. The data flow 200 depicts a domain name analysis device 220, which may be implemented by a computing device, such as the computing device 100 described above with respect to FIG. 1. The client device 210 may be any computing device suitable for network communications, such as a personal computer, mobile computer, virtual machine, server computer, or any combination thereof. For example, the client device 210 may be a virtual machine operating within a private cloud computing environment that uses the domain name analysis device 220 to provide security for the private cloud network.

During its operation, the client device 210 may periodically communicate using various network communications protocols. DNS queries are one form of network communications that may originate from the client device 210, e.g., in the form of a DNS query packet 212. Each DNS query packet 212 is addressed to a DNS server which will perform domain name resolution on a particular domain name. For example, in a situation where the client device 210 implements an e-mail application, a DNS query packet may be issued to identify the destination for an e-mail addressed to “user@example.com.”

DNS query packets, such as the DNS query packet 212, may be routed through or otherwise provided to the domain name analysis device 220. The domain name analysis device 220 extracts the domain name 222 from the DNS query packet 212 and provides the domain name 222 to a predictive model 230. In the example data flow 200, the predictive model 230 has been trained to determine whether domain names are algorithmically generated using syntactic features of valid (e.g., non-algorithmically generated) domain names 234 and algorithmically generated domain names 236. As discussed above, the syntactic features used to train the predictive model 230 may vary, and may include a count of particular character n-grams, such as impossible English bigrams and/or trigrams, included in at least a portion of the query domain name, where n>1. Impossible English bigrams may, by way of example, include the following bigrams, which do not occur within a single English word: bk, bq, bx, cb, cf, cg, cj, cp, cv, cw, cx, dx, fk, fq, fv, fx, fz, gq, gv, gx, hk, hv, hx, hz, iy, jb, jc, jd, jf, jg, jh, jk, jl, jm, jn, jp, jq, jr, js, jt, jv, jw, jx, jy, jz, kq, kv, kx, kz, lq, lx, mg, mj, mq, mx, mz, pq, pv, px, qb, qc, qd, qe, qf, qg, qh, qj, qk, ql, qm, qn, qo, qp, qr, qs, qt, qv, qw, qx, qy, qz, sx, sz, tq, tx, vb, vc, vd, vf, vg, vh, vj, vk, vm, vn, vp, vq, vt, vw, vx, vz, wq, wv, wx, wz, xb, xg, xj, xk, xv, xz, yq, yv, yz, zb, zc, zg, zh, zj, zn, zq, zr, zs, zx. The predictive model 230, in response to receiving the domain name 222 as input, provides output 232 to the domain name analysis device 220, the output 232 indicating whether the domain name 222 was algorithmically generated.
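The count of impossible English bigrams described above may, for example, be computed with a simple set lookup. The sketch below uses the bigram list exactly as enumerated in the preceding paragraph; the names `IMPOSSIBLE_BIGRAMS` and `count_impossible_bigrams` are hypothetical.

```python
# Bigram list transcribed from the enumeration in the description.
IMPOSSIBLE_BIGRAMS = frozenset("""
bk bq bx cb cf cg cj cp cv cw cx dx fk fq fv fx fz gq gv gx hk hv hx hz iy
jb jc jd jf jg jh jk jl jm jn jp jq jr js jt jv jw jx jy jz kq kv kx kz lq
lx mg mj mq mx mz pq pv px qb qc qd qe qf qg qh qj qk ql qm qn qo qp qr qs
qt qv qw qx qy qz sx sz tq tx vb vc vd vf vg vh vj vk vm vn vp vq vt vw vx
vz wq wv wx wz xb xg xj xk xv xz yq yv yz zb zc zg zh zj zn zq zr zs zx
""".split())


def count_impossible_bigrams(label):
    """Count overlapping character pairs of label that are in the list."""
    s = label.lower()
    return sum(s[i:i + 2] in IMPOSSIBLE_BIGRAMS for i in range(len(s) - 1))


count_for_example = count_impossible_bigrams("example")
count_for_random = count_impossible_bigrams("jqxz")
```

A second level domain such as “example” contains none of the listed bigrams, whereas a string such as “jqxz” contains several, which is the kind of difference the predictive model 230 may learn to exploit.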

In the example data flow 200, the domain name analysis device 220 provides the domain name 222 to one or more additional predictive model(s) 240. In some implementations, each other predictive model 240 may be separately trained to identify a particular DGA used to generate the domain name 222. For example, each of the predictive model(s) 240 may be trained using domain names previously known to have been generated by a particular DGA, e.g., Algorithm A domain names 244 may be used to train one of the predictive models 240 to determine whether a domain name was generated using Algorithm A, while Algorithm X domain names 246 may be used to train a different predictive model to determine whether a domain name was generated using Algorithm X. Providing the domain name to the additional predictive model(s) 240 may, in some implementations, be performed in response to the output 232 from the first predictive model 230 indicating the domain name 222 is algorithmically generated. The output 242 provided by any of the predictive model(s) 240 may indicate whether the query domain name 222 was generated by the DGA used to train that predictive model.

In some implementations, the predictive model(s) 240 are a single predictive model trained to determine which DGA, of multiple DGAs, was used to generate the query domain name 222. For example, a predictive model 240 may be trained to determine, based on syntactic features of the domain name 222, which DGA was most likely used to generate the domain name 222 and provide data indicating that DGA as output 242. In implementations where measures of confidence are used to indicate the likelihood that a domain name is algorithmically generated and/or to indicate the likelihood that a domain name was generated by a particular DGA, the predictive model(s) 240 may produce output indicating the measure(s) of likelihood associated with each DGA. For example, if a single predictive model 240 is trained to determine the likelihood that the domain name 222 belongs to one of 10 DGAs or an unknown DGA, the predictive model 240 may provide, as output, a ranked list of the most likely DGAs used to generate the domain name 222 with a measure of confidence for each DGA in the list, including a measure of confidence for an unknown DGA, if applicable.
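One way such a ranked list might be produced is sketched below. The sketch assumes, purely for illustration, that the predictive model 240 yields a raw numeric score per DGA and that the scores are normalized with a softmax so that the measures of confidence sum to one; the disclosure does not specify how the measures are computed. The function name `rank_dgas` and the score values are hypothetical.

```python
import math


def rank_dgas(scores):
    """Turn raw per-DGA scores into (label, confidence) pairs, best first.

    Assumption: a softmax is used for normalization; any other calibrated
    measure of likelihood could be substituted.
    """
    top = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(weights.values())
    ranked = [(label, weight / total) for label, weight in weights.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical raw scores for one query domain name.
ranked = rank_dgas({"Algorithm A": 2.0, "Algorithm X": 0.5, "unknown DGA": -1.0})
```

The first entry of `ranked` identifies the most likely DGA, and the remaining entries carry the measures of confidence for the other candidates, including the unknown DGA.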

While the example data flow 200 depicts multiple predictive models being used to determine whether a DGA was used to generate the domain name 222 and to determine which particular DGA was used, in some implementations a single predictive model may be trained to identify which particular DGA is used, implicitly determining that a DGA was used. For example, a single predictive model may be trained using multiple sets of data, e.g., a set of domain names known to be valid/non-algorithmically generated, and multiple other sets of domain names that each include domain names algorithmically generated by a particular DGA. Accordingly, the single predictive model may, for example, provide output indicating that a given domain name is either valid or generated by a particular DGA.

The predictive model 230 and other predictive model(s) 240, while depicted separately from the domain name analysis device 220 in the example data flow 200, may be included in a single device 225, such as a single intermediary network device or server computer. For example, the single device 225 may be the domain name analysis device, which includes the predictive model 230 and other predictive model(s) 240. The predictive models may, in some implementations, be implemented in separate computing devices, e.g., each model may be implemented by a separate computing device, each in communication with the domain name analysis device 220. In some implementations, the predictive model 230 and/or other predictive models 240 may be trained by the same device or by separate devices. For example, predictive models may be separately trained by one or more computing devices and provided to a single network intrusion prevention device, such as device 225, for performing the identification of algorithmically generated domain names using the predictive models.

While only a single client device 210 is depicted in the example data flow 200, multiple client devices may provide DNS query packets to the domain name analysis device 220, and the client device 210 may provide more than one DNS query packet. In response to identifying an algorithmically generated domain, the domain name analysis device 220 may take a variety of actions. For example, the domain name analysis device 220 may log occurrences of algorithmically generated domains and/or provide data indicating the domain, identified DGA, and/or client device identifying information to another entity, such as a security event manager, network administrator computing device(s), the client device 210, and/or an entity that manages the client device 210. In addition, a domain name analysis device 220 implemented in an intermediary network device, such as an intrusion prevention system, may block the DNS query packets that include an algorithmically generated domain and/or block further traffic from the client device that provided the DNS query packet.

FIG. 3 is a flowchart of an example method 300 for identifying algorithmically generated domains. The method may be implemented by a computing device, such as computing device 100 described above with reference to FIG. 1. The method may also be implemented by the circuitry of a programmable hardware processor, such as a field-programmable gate array (FPGA) and/or an application-specific integrated circuit (ASIC). Combinations of one or more of the foregoing processors may also be used to identify algorithmically generated domains.

A DNS query packet that includes a domain name is received (302). For example, an intermediary network device may capture DNS query packets as they flow through a network. A hardware processor, such as an FPGA, may be configured to extract domain names from each DNS query packet that flows through the intermediary network device.
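Although the extraction may be performed in hardware, the same step may be sketched in software for clarity. The sketch below assumes that the input is the raw DNS message, i.e., the transport payload with any IP and UDP headers already removed, and it reads the first question name as a sequence of length-prefixed labels following the 12-byte DNS header. The function names `extract_query_name` and `build_query` are hypothetical, and `build_query` exists only to produce a sample message for the example.

```python
import struct


def extract_query_name(payload):
    """Return the first question name from a raw DNS query message."""
    if len(payload) < 12:
        raise ValueError("too short to contain a DNS header")
    _query_id, flags, qdcount = struct.unpack("!HHH", payload[:6])
    if flags & 0x8000:
        raise ValueError("message is a response, not a query")
    if qdcount < 1:
        raise ValueError("message has no question section")
    labels, pos = [], 12
    while True:
        if pos >= len(payload):
            raise ValueError("truncated name")
        length = payload[pos]
        pos += 1
        if length == 0:          # zero-length label terminates the name
            break
        if length & 0xC0:        # compression pointers are not handled here
            raise ValueError("unsupported label type")
        if pos + length > len(payload):
            raise ValueError("truncated label")
        labels.append(payload[pos:pos + length].decode("ascii", errors="replace"))
        pos += length
    return ".".join(labels)


def build_query(name, query_id=0x1234):
    """Build a minimal sample query (type A, class IN) for demonstration."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


query_name = extract_query_name(build_query("www.example.com"))
```

The extracted name may then be passed to the determination of step 304.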

The hardware processor determines, based on syntactic features of the query domain name, whether the query domain name is an algorithmically generated domain name (304). The syntactic features include a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one. In implementations where a programmable hardware processor is used, the syntactic features used to determine whether a domain name is algorithmically generated may depend upon the configuration of the programmable hardware processor, e.g., impossible English bigrams may be hardcoded into a particular FPGA configuration along with any other syntactic features used to identify algorithmically generated domains, e.g., using regular expressions to match syntactic features of the query domain name that are indicative of DGA use. Updating or otherwise changing the configuration may result in different features being used to identify algorithmically generated domains.
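A software analogue of such a hardcoded regular expression is sketched below. The pattern covers only the impossible English bigrams that begin with “q” or “j” from the example list given earlier in this description, expressed as character classes; it is an illustrative subset, not a complete configuration, and the names `HARDCODED_PATTERN` and `count_pattern_hits` are hypothetical. A lookahead is used so that overlapping bigrams are each counted.

```python
import re

# "q" followed by any letter other than a, i, q, or u, and "j" followed by
# any consonant other than j; both taken from the example bigram list.
HARDCODED_PATTERN = re.compile(r"(?=q[b-hj-pr-tv-z]|j[b-df-hk-np-tv-z])")


def count_pattern_hits(label):
    """Count positions in label at which a hardcoded bigram starts."""
    return len(HARDCODED_PATTERN.findall(label.lower()))


hits_example = count_pattern_hits("example")
hits_random = count_pattern_hits("jqjq")
```

Changing the pattern corresponds to updating the configuration, which, as noted above, may result in different features being used.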

In some implementations, determining whether the query domain name is algorithmically generated includes determining which DGA was used to generate the query domain name. In implementations where a programmable hardware processor is used, for example, the programmable logic of a particular processor configuration may define which syntactic features correspond to which DGAs.

In some implementations, multiple hardware processors may be used to determine whether a domain name is algorithmically generated and/or determine which DGA was used to generate the domain name. For example, multiple FPGAs may be used, each with a different configuration, e.g., DGAs may each be associated with a particular FPGA that is configured to identify the use of the associated DGA based on syntactic features that may be different for the different DGAs. One set of syntactic features used to determine if a particular DGA was used may be different from a second set of syntactic features used to determine if another DGA was used. A variety of processors, programmable or otherwise, and a variety of syntactic features, may be used to determine whether a given domain name is algorithmically generated and/or which DGA was used to generate the given domain name.

In some implementations, the determination of whether the query domain name is algorithmically generated is based on output received from a predictive model. For example, as described in FIGS. 1-2 above, a predictive model may be trained to determine whether a DGA was used to generate a particular domain and, in some implementations, to determine which DGA was used. In implementations where one or more predictive models are used to determine DGA use, the predictive models may be implemented in computing devices, programmable hardware processors, and/or a combination thereof.

The hardware processor provides output indicating whether the query domain name is algorithmically generated (306). In implementations where the hardware processor determines which DGA was used, the output indicates which DGA was used. As noted above, multiple processors may be used to process each domain name, and in these situations each processor may provide output, e.g., indicating whether a particular DGA was used to generate the processed domain name. In situations where multiple processors are used, domain names may be processed serially, or in parallel, and the output of each processor may, in some implementations, include a measure of likelihood that the domain name was generated using the DGA associated with the processor.

In some implementations, the output may be used in a variety of ways. For example, the output may be logged for later analysis by the hardware processor and/or another entity or device. The DNS query packet that includes an algorithmically generated domain may be blocked, and further network communications from the source client device may be blocked, e.g., in situations where the hardware processor is implemented in an in-line network intrusion prevention device. The output may also be provided to another entity, device, or process, such as a security event manager, administrative client device, the client device that provided the DNS query, and/or an entity that manages the client device.

The foregoing disclosure describes a number of example implementations for identifying algorithmically generated domains. As detailed above, examples provide a mechanism for identifying DGA use based on syntactic features of domain names and potential applications of a system that is capable of identifying algorithmically generated domains.

We claim:
 1. A non-transitory machine-readable storage medium encoded with instructions executable by a hardware processor of a computing device for identifying algorithmically generated domains, the machine-readable storage medium comprising instructions to cause the hardware processor to: receive a first set of domain names, each domain name in the first set being a domain name that was previously identified as valid; receive a second set of domain names, each domain name in the second set being a domain name that was previously identified as algorithmically generated; and train, using the first set and second set, a predictive model to identify a given domain name as one of a valid domain name or an algorithmically generated domain name based on a plurality of syntactic features, the plurality of syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one.
 2. The storage medium of claim 1, wherein the instructions further cause the hardware processor to: receive a query domain name; provide the query domain name to the predictive model as input; and receive, as output from the predictive model, a prediction specifying that the query domain name is one of i) a valid domain name, or ii) an algorithmically generated domain name.
 3. The storage medium of claim 2, wherein, responsive to the output specifying that the query domain name is an algorithmically generated domain name, the instructions further cause the hardware processor to: provide a third party computing device with data indicating that the query domain name is algorithmically generated.
 4. The storage medium of claim 2, wherein each query domain name is received by: receiving a DNS query packet; and extracting, from the DNS query packet, the query domain name.
 5. The storage medium of claim 1, wherein: each domain name in the second set was previously identified as being generated by one of a plurality of domain name generation algorithms; and identifying the given domain name as an algorithmically generated domain name includes identifying the given domain name as being generated by one of the plurality of domain name generation algorithms.
 6. The storage medium of claim 1, wherein the instructions further cause the hardware processor to: obtain a third set of domain names, each domain name in the third set being a domain name that was previously identified as being generated by a particular domain name generation algorithm; and train, using the third set, a second predictive model to determine whether a particular domain name was generated by the particular domain name generation algorithm, the determination being based on at least one of the plurality of syntactic features.
 7. A computing device for identifying algorithmically generated domains, the computing device comprising: a hardware processor; and a data storage device storing instructions that, when executed by the hardware processor, cause the hardware processor to: receive a query domain name; provide the query domain name as input to a predictive model that has been trained to determine whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and receive, as output from the predictive model, data indicating whether the query domain name is algorithmically generated.
 8. The computing device of claim 7, wherein the instructions further cause the hardware processor to: provide the query domain name as input to a second predictive model that has been trained to determine whether the query domain name was generated by a particular domain name generation algorithm, the determination being based on at least one of the syntactic features of the query domain name; and receive, as output from the second predictive model, data indicating whether the query domain name was generated by the particular domain name generation algorithm.
 9. The computing device of claim 7, wherein the data indicating whether the query domain name is algorithmically generated specifies that the query domain name was generated by a particular domain name generation algorithm.
 10. The computing device of claim 7, wherein the instructions further cause the hardware processor to: provide the query domain name as input to a second predictive model that has been trained to determine which domain name generation algorithm of a plurality of domain name generation algorithms was used to generate the query domain name, the determination being based on at least one of the syntactic features of the query domain name; and receive, as output from the second predictive model, data indicating one of the plurality of domain name generation algorithms.
 11. The computing device of claim 7, wherein each query domain name is received by: receiving a DNS query packet; and extracting, from the DNS query packet, the query domain name.
 12. The computing device of claim 7, wherein the data indicating whether the query domain name is algorithmically generated specifies a measure of likelihood that the domain name is algorithmically generated.
 13. The computing device of claim 10, wherein the data indicating one of the plurality of domain name generation algorithms specifies, for each of at least one of the plurality of domain name generation algorithms, a measure of likelihood that the query domain name was generated by the domain name generation algorithm.
 14. A method for identifying algorithmically generated domains, implemented by a hardware processor, the method comprising: receiving, from a client device, a domain name system (DNS) query packet, the DNS query packet including a query domain name; determining whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and providing output indicating whether the query domain name is algorithmically generated.
 15. The method of claim 14, wherein: determining whether the query domain name is an algorithmically generated domain name comprises determining which domain generation algorithm (DGA) of a plurality of DGAs was used to generate the query domain name, and the output indicates which of the plurality of DGAs was used to generate the query domain name.